Dataset statistics
| Number of variables | 11 |
|---|---|
| Number of observations | 10430 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.4 MiB |
| Average record size in memory | 138.0 B |
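The overview figures above are plain counts over the table. A minimal sketch of how such counts could be derived, assuming the data is held as a list of equal-length tuples with `None` marking a missing cell; the function name `dataset_overview`, the toy rows, and the rule that every repeat of an already-seen row counts as one duplicate are assumptions of this sketch, not taken from the report:

```python
from collections import Counter

def dataset_overview(rows):
    """Basic counts for a table given as a list of equal-length tuples."""
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    total_cells = n_obs * n_vars
    # A cell counts as missing when it is None.
    missing = sum(cell is None for row in rows for cell in row)
    # Assumption: each repeat of an already-seen row is one duplicate.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    return {
        "variables": n_vars,
        "observations": n_obs,
        "missing_cells": missing,
        "missing_pct": 100.0 * missing / total_cells if total_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_pct": 100.0 * duplicates / n_obs if n_obs else 0.0,
    }

# Hypothetical toy table: one missing cell, one repeated row.
toy_rows = [(0.1, "A"), (0.2, "F"), (0.1, "A"), (None, "E")]
overview = dataset_overview(toy_rows)
```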
Variable types
| Numeric | 10 |
|---|---|
| Categorical | 1 |
modular_ratio is highly correlated with ratio | High correlation |
weight is highly correlated with peak_number | High correlation |
peak_number is highly correlated with weight | High correlation |
ratio is highly correlated with modular_ratio | High correlation |
upper_margin is highly correlated with interlinear_spacing | High correlation |
interlinear_spacing is highly correlated with upper_margin | High correlation |
intercolumnar_distance is highly correlated with row_number | High correlation |
upper_margin is highly correlated with lower_margin and 4 other fields | High correlation |
lower_margin is highly correlated with upper_margin and 3 other fields | High correlation |
row_number is highly correlated with intercolumnar_distance and 1 other field | High correlation |
modular_ratio is highly correlated with upper_margin and 4 other fields | High correlation |
interlinear_spacing is highly correlated with upper_margin and 4 other fields | High correlation |
weight is highly correlated with upper_margin | High correlation |
peak_number is highly correlated with upper_margin and 3 other fields | High correlation |
ratio is highly correlated with modular_ratio and 1 other field | High correlation |
class is highly correlated with row_number | High correlation |
upper_margin is highly skewed (γ1 = 91.76218687) | Skewed |
interlinear_spacing is highly skewed (γ1 = 22.13946265) | Skewed |
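The skewness alerts above report γ1, the standardized third moment. A minimal sketch of such a check, assuming the plain moment-based estimator without small-sample correction; the cut-off `SKEW_THRESHOLD` and both function names are illustrative assumptions, and the report's own estimator and threshold may differ:

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

SKEW_THRESHOLD = 20.0  # assumed cut-off, chosen for illustration only

def is_highly_skewed(values, threshold=SKEW_THRESHOLD):
    """Flag a variable whose absolute skewness exceeds the threshold."""
    return abs(skewness(values)) > threshold

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
one_outlier = [0.0] * 999 + [1000.0]  # a single extreme value dominates g1
```

A single extreme observation, such as the maximum of 386 in upper_margin, is enough to push γ1 far above the bulk of the distribution.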
Reproduction
| Analysis started | 2022-09-22 23:55:09.896442 |
|---|---|
| Analysis finished | 2022-09-22 23:56:06.946630 |
| Duration | 57.05 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
intercolumnar_distance
Real number (ℝ)
| Distinct | 144 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.0008524439118 |
| Minimum | -3.498799 |
|---|---|
| Maximum | 11.819916 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 4434 |
| Negative (%) | 42.5% |
| Memory size | 81.6 KiB |
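The percentages in this block are each count divided by the 10430 observations. A small sketch that recomputes two of them and shows how the counts could be taken from raw values; the helper names `pct` and `count_summary` and the toy list are assumptions for illustration:

```python
def pct(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * count / total, 1)

def count_summary(values):
    """Distinct, zero and negative counts for a list of numbers."""
    n = len(values)
    distinct = len(set(values))
    zeros = sum(1 for v in values if v == 0)
    negative = sum(1 for v in values if v < 0)
    return {
        "distinct": distinct,
        "distinct_pct": pct(distinct, n),
        "zeros": zeros,
        "zeros_pct": pct(zeros, n),
        "negative": negative,
        "negative_pct": pct(negative, n),
    }

# Counts taken from the table above, with 10430 observations.
distinct_pct = pct(144, 10430)
negative_pct = pct(4434, 10430)

# Hypothetical toy values.
toy = count_summary([-1.5, 0.0, 2.0, 2.0])
```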
Quantile statistics
| Minimum | -3.498799 |
|---|---|
| 5-th percentile | -0.585651 |
| Q1 | -0.128929 |
| median | 0.043885 |
| Q3 | 0.204355 |
| 95-th percentile | 0.661077 |
| Maximum | 11.819916 |
| Range | 15.318715 |
| Interquartile range (IQR) | 0.333284 |
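Range and IQR follow directly from the other rows of the quantile table: range is maximum minus minimum, and IQR is Q3 minus Q1. The sketch below redoes that arithmetic with the values listed above and, on toy data, computes quartiles by linear interpolation; whether the report used the same interpolation rule is an assumption:

```python
import statistics

# Values copied from the quantile table above.
minimum, q1, q3, maximum = -3.498799, -0.128929, 0.204355, 11.819916

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle half of the data

# Toy data: quartiles with linear interpolation between order statistics.
toy_q1, toy_median, toy_q3 = statistics.quantiles(
    [1, 2, 3, 4, 5], n=4, method="inclusive"
)
```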
Descriptive statistics
| Standard deviation | 0.9914313305 |
|---|---|
| Coefficient of variation (CV) | 1163.045822 |
| Kurtosis | 40.47992399 |
| Mean | 0.0008524439118 |
| Median Absolute Deviation (MAD) | 0.172814 |
| Skewness | 2.512437908 |
| Sum | 8.89099 |
| Variance | 0.9829360831 |
| Monotonicity | Not monotonic |
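Several rows of this table are functions of the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. Because the mean is close to zero here, the CV becomes very large and says little about the spread. A short sketch that checks this arithmetic and adds a median absolute deviation helper (the function name `median_abs_deviation` and the toy input are illustrative assumptions):

```python
import statistics

# Values copied from the descriptive statistics above.
std, mean = 0.9914313305, 0.0008524439118

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation

def median_abs_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)

# Toy data: the outlier 100 barely moves the MAD.
toy_mad = median_abs_deviation([1, 2, 3, 4, 100])
```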
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| -3.498799 | 308 | 3.0% |
| 0.15498 | 286 | 2.7% |
| 0.080916 | 272 | 2.6% |
| 0.130292 | 271 | 2.6% |
| 0.117948 | 265 | 2.5% |
| -0.042522 | 265 | 2.5% |
| 0.019197 | 251 | 2.4% |
| 0.142636 | 241 | 2.3% |
| -0.00549 | 236 | 2.3% |
| 0.031541 | 228 | 2.2% |
| Other values (134) | 7807 | 74.9% |

| Value | Count | Frequency (%) |
|---|---|---|
| -3.498799 | 308 | 3.0% |
| -3.486455 | 4 | < 0.1% |
| -3.461768 | 6 | 0.1% |
| -3.43708 | 6 | 0.1% |
| -3.412392 | 5 | < 0.1% |
| -3.054421 | 11 | 0.1% |
| -2.807544 | 6 | 0.1% |
| -2.573011 | 9 | 0.1% |
| -2.523635 | 18 | 0.2% |
| -2.47426 | 4 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 11.819916 | 8 | 0.1% |
| 9.943651 | 9 | 0.1% |
| 9.52396 | 9 | 0.1% |
| 8.314263 | 8 | 0.1% |
| 5.759087 | 7 | 0.1% |
| 4.96908 | 10 | 0.1% |
| 4.524702 | 9 | 0.1% |
| 4.462983 | 16 | 0.2% |
| 3.722352 | 12 | 0.1% |
| 3.265629 | 16 | 0.2% |
upper_margin
Real number (ℝ)
| Distinct | 208 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.03361059195 |
| Minimum | -2.426761 |
|---|---|
| Maximum | 386 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 5922 |
| Negative (%) | 56.8% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -2.426761 |
|---|---|
| 5-th percentile | -0.620989 |
| Q1 | -0.259834 |
| median | -0.055704 |
| Q3 | 0.203385 |
| 95-th percentile | 0.572391 |
| Maximum | 386 |
| Range | 388.426761 |
| Interquartile range (IQR) | 0.463219 |
Descriptive statistics
| Standard deviation | 3.920868463 |
|---|---|
| Coefficient of variation (CV) | 116.6557396 |
| Kurtosis | 9007.991724 |
| Mean | 0.03361059195 |
| Median Absolute Deviation (MAD) | 0.227685 |
| Skewness | 91.76218687 |
| Sum | 350.558474 |
| Variance | 15.3732095 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| -0.189174 | 206 | 2.0% |
| -0.291239 | 173 | 1.7% |
| -0.220579 | 157 | 1.5% |
| 0.069915 | 149 | 1.4% |
| 0.289748 | 143 | 1.4% |
| -0.338346 | 140 | 1.3% |
| -0.055704 | 136 | 1.3% |
| -0.071406 | 135 | 1.3% |
| 0.203385 | 134 | 1.3% |
| -0.063555 | 133 | 1.3% |
| Other values (198) | 8924 | 85.6% |

| Value | Count | Frequency (%) |
|---|---|---|
| -2.426761 | 130 | 1.2% |
| -2.395356 | 14 | 0.1% |
| -2.08916 | 12 | 0.1% |
| -1.963541 | 13 | 0.1% |
| -1.947839 | 14 | 0.1% |
| -1.916434 | 16 | 0.2% |
| -1.680899 | 16 | 0.2% |
| -1.649494 | 8 | 0.1% |
| -1.304042 | 12 | 0.1% |
| -1.296191 | 13 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 386 | 1 | < 0.1% |
| 43.133656 | 1 | < 0.1% |
| 19.470188 | 3 | < 0.1% |
| 17.570202 | 4 | < 0.1% |
| 16.965662 | 2 | < 0.1% |
| 13.895848 | 3 | < 0.1% |
| 12.655362 | 8 | 0.1% |
| 10.65331 | 6 | 0.1% |
| 9.65621 | 11 | 0.1% |
| 8.13308 | 1 | < 0.1% |
lower_margin
Real number (ℝ)
| Distinct | 231 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.0005253415149 |
| Minimum | -3.210528 |
|---|---|
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1737 |
| Negative (%) | 16.7% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -3.210528 |
|---|---|
| 5-th percentile | -3.210528 |
| Q1 | 0.064919 |
| median | 0.217845 |
| Q3 | 0.352988 |
| 95-th percentile | 0.537921 |
| Maximum | 50 |
| Range | 53.210528 |
| Interquartile range (IQR) | 0.288069 |
Descriptive statistics
| Standard deviation | 1.12020198 |
|---|---|
| Coefficient of variation (CV) | -2132.330966 |
| Kurtosis | 386.1347465 |
| Mean | -0.0005253415149 |
| Median Absolute Deviation (MAD) | 0.142256 |
| Skewness | 7.474409689 |
| Sum | -5.479312 |
| Variance | 1.254852476 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| -3.210528 | 627 | 6.0% |
| 0.214288 | 143 | 1.4% |
| 0.239183 | 143 | 1.4% |
| 0.388552 | 135 | 1.3% |
| 0.14316 | 124 | 1.2% |
| 0.139604 | 121 | 1.2% |
| 0.306755 | 119 | 1.1% |
| 0.299642 | 119 | 1.1% |
| 0.352988 | 117 | 1.1% |
| 0.349432 | 115 | 1.1% |
| Other values (221) | 8667 | 83.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| -3.210528 | 627 | 6.0% |
| -3.206971 | 15 | 0.1% |
| -3.203415 | 32 | 0.3% |
| -3.075385 | 18 | 0.2% |
| -2.975805 | 23 | 0.2% |
| -2.958023 | 16 | 0.2% |
| -2.908234 | 12 | 0.1% |
| -2.481465 | 8 | 0.1% |
| -2.349878 | 18 | 0.2% |
| -2.324983 | 13 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 50 | 1 | < 0.1% |
| 7.458681 | 5 | < 0.1% |
| 7.419561 | 3 | < 0.1% |
| 6.381091 | 1 | < 0.1% |
| 6.260173 | 3 | < 0.1% |
| 5.49199 | 12 | 0.1% |
| 5.196809 | 6 | 0.1% |
| 5.083004 | 9 | 0.1% |
| 4.329047 | 4 | < 0.1% |
| 3.941399 | 11 | 0.1% |
exploitation
Real number (ℝ)
| Distinct | 750 |
|---|---|
| Distinct (%) | 7.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.002386667881 |
| Minimum | -5.440122 |
|---|---|
| Maximum | 3.987152 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 4859 |
| Negative (%) | 46.6% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -5.440122 |
|---|---|
| 5-th percentile | -1.815842 |
| Q1 | -0.528002 |
| median | 0.095763 |
| Q3 | 0.65821 |
| 95-th percentile | 1.401386 |
| Maximum | 3.987152 |
| Range | 9.427274 |
| Interquartile range (IQR) | 1.186212 |
Descriptive statistics
| Standard deviation | 1.008526572 |
|---|---|
| Coefficient of variation (CV) | -422.5667843 |
| Kurtosis | 3.342667849 |
| Mean | -0.002386667881 |
| Median Absolute Deviation (MAD) | 0.606907 |
| Skewness | -0.9248334086 |
| Sum | -24.892946 |
| Variance | 1.017125846 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| -5.440122 | 43 | 0.4% |
| -0.527256 | 29 | 0.3% |
| -0.184417 | 24 | 0.2% |
| 0.139087 | 24 | 0.2% |
| 0.334258 | 23 | 0.2% |
| 0.711352 | 23 | 0.2% |
| -1.078156 | 23 | 0.2% |
| -1.981085 | 23 | 0.2% |
| -0.817372 | 22 | 0.2% |
| 0.955725 | 22 | 0.2% |
| Other values (740) | 10174 | 97.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| -5.440122 | 43 | 0.4% |
| -3.441837 | 6 | 0.1% |
| -3.018853 | 9 | 0.1% |
| -2.9863 | 4 | < 0.1% |
| -2.963951 | 16 | 0.2% |
| -2.941364 | 8 | 0.1% |
| -2.832246 | 16 | 0.2% |
| -2.809808 | 9 | 0.1% |
| -2.701227 | 12 | 0.1% |
| -2.636061 | 8 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 3.987152 | 13 | 0.1% |
| 2.974359 | 1 | < 0.1% |
| 2.791392 | 9 | 0.1% |
| 2.258633 | 12 | 0.1% |
| 2.211191 | 13 | 0.1% |
| 2.123586 | 15 | 0.1% |
| 2.046336 | 14 | 0.1% |
| 2.04189 | 16 | 0.2% |
| 2.021451 | 3 | < 0.1% |
| 2.004055 | 17 | 0.2% |
row_number
Real number (ℝ)
| Distinct | 48 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.006369532215 |
| Minimum | -4.922215 |
|---|---|
| Maximum | 1.066121 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 1539 |
| Negative (%) | 14.8% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -4.922215 |
|---|---|
| 5-th percentile | -1.347089 |
| Q1 | 0.17234 |
| median | 0.261718 |
| Q3 | 0.261718 |
| 95-th percentile | 0.976743 |
| Maximum | 1.066121 |
| Range | 5.988336 |
| Interquartile range (IQR) | 0.089378 |
Descriptive statistics
| Standard deviation | 0.9920533183 |
|---|---|
| Coefficient of variation (CV) | 155.7497921 |
| Kurtosis | 14.80829878 |
| Mean | 0.006369532215 |
| Median Absolute Deviation (MAD) | 0.089378 |
| Skewness | -3.701355076 |
| Sum | 66.434221 |
| Variance | 0.9841697864 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=48)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.261718 | 5210 | 50.0% |
| 0.17234 | 1920 | 18.4% |
| 0.976743 | 565 | 5.4% |
| 0.082961 | 500 | 4.8% |
| -4.922215 | 265 | 2.5% |
| 0.351096 | 167 | 1.6% |
| -0.006417 | 131 | 1.3% |
| 0.887365 | 116 | 1.1% |
| -1.257711 | 112 | 1.1% |
| -1.078955 | 112 | 1.1% |
| Other values (38) | 1332 | 12.8% |

| Value | Count | Frequency (%) |
|---|---|---|
| -4.922215 | 265 | 2.5% |
| -4.832837 | 10 | 0.1% |
| -4.743459 | 4 | < 0.1% |
| -4.654081 | 9 | 0.1% |
| -3.849677 | 6 | 0.1% |
| -3.313408 | 10 | 0.1% |
| -3.22403 | 13 | 0.1% |
| -3.134652 | 8 | 0.1% |
| -3.045274 | 19 | 0.2% |
| -2.777139 | 13 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 1.066121 | 41 | 0.4% |
| 0.976743 | 565 | 5.4% |
| 0.887365 | 116 | 1.1% |
| 0.797987 | 88 | 0.8% |
| 0.708609 | 92 | 0.9% |
| 0.61923 | 102 | 1.0% |
| 0.529852 | 35 | 0.3% |
| 0.440474 | 54 | 0.5% |
| 0.351096 | 167 | 1.6% |
| 0.261718 | 5210 | 50.0% |
modular_ratio
Real number (ℝ)
| Distinct | 226 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.01397286385 |
| Minimum | -7.450257 |
|---|---|
| Maximum | 53 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 5482 |
| Negative (%) | 52.6% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -7.450257 |
|---|---|
| 5-th percentile | -1.429155 |
| Q1 | -0.598658 |
| median | -0.058835 |
| Q3 | 0.564038 |
| 95-th percentile | 1.685209 |
| Maximum | 53 |
| Range | 60.450257 |
| Interquartile range (IQR) | 1.162696 |
Descriptive statistics
| Standard deviation | 1.126245103 |
|---|---|
| Coefficient of variation (CV) | 80.60230992 |
| Kurtosis | 470.3351746 |
| Mean | 0.01397286385 |
| Median Absolute Deviation (MAD) | 0.581347 |
| Skewness | 10.15556663 |
| Sum | 145.73697 |
| Variance | 1.268428032 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.107265 | 225 | 2.2% |
| -0.141884 | 221 | 2.1% |
| 0.024215 | 220 | 2.1% |
| -0.058835 | 214 | 2.1% |
| -0.474083 | 210 | 2.0% |
| -0.224934 | 203 | 1.9% |
| -0.307984 | 198 | 1.9% |
| -0.349509 | 194 | 1.9% |
| -0.10036 | 194 | 1.9% |
| -0.432558 | 193 | 1.9% |
| Other values (216) | 8358 | 80.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| -7.450257 | 3 | < 0.1% |
| -4.418943 | 1 | < 0.1% |
| -3.920645 | 1 | < 0.1% |
| -3.87912 | 1 | < 0.1% |
| -3.837595 | 2 | < 0.1% |
| -3.796071 | 1 | < 0.1% |
| -3.713021 | 1 | < 0.1% |
| -3.671496 | 2 | < 0.1% |
| -3.588447 | 2 | < 0.1% |
| -3.546922 | 1 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 53 | 1 | < 0.1% |
| 5.505495 | 1 | < 0.1% |
| 5.007196 | 1 | < 0.1% |
| 4.882622 | 2 | < 0.1% |
| 4.716523 | 1 | < 0.1% |
| 4.591948 | 1 | < 0.1% |
| 4.550423 | 1 | < 0.1% |
| 4.508898 | 2 | < 0.1% |
| 4.342799 | 1 | < 0.1% |
| 4.259749 | 2 | < 0.1% |
interlinear_spacing
Real number (ℝ)
| Distinct | 229 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.005605411505 |
| Minimum | -11.935457 |
|---|---|
| Maximum | 83 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 3099 |
| Negative (%) | 29.7% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -11.935457 |
|---|---|
| 5-th percentile | -1.591843 |
| Q1 | -0.044076 |
| median | 0.220177 |
| Q3 | 0.446679 |
| 95-th percentile | 0.824183 |
| Maximum | 83 |
| Range | 94.935457 |
| Interquartile range (IQR) | 0.490755 |
Descriptive statistics
| Standard deviation | 1.313754284 |
|---|---|
| Coefficient of variation (CV) | 234.3724958 |
| Kurtosis | 1541.044209 |
| Mean | 0.005605411505 |
| Median Absolute Deviation (MAD) | 0.226503 |
| Skewness | 22.13946265 |
| Sum | 58.464442 |
| Variance | 1.725950319 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.295677 | 512 | 4.9% |
| 0.144676 | 473 | 4.5% |
| 0.182426 | 465 | 4.5% |
| 0.333428 | 451 | 4.3% |
| 0.257927 | 450 | 4.3% |
| 0.220177 | 433 | 4.2% |
| 0.446679 | 390 | 3.7% |
| 0.371178 | 388 | 3.7% |
| 0.408929 | 383 | 3.7% |
| 0.069175 | 381 | 3.7% |
| Other values (219) | 6104 | 58.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| -11.935457 | 17 | 0.2% |
| -9.670432 | 1 | < 0.1% |
| -8.990925 | 1 | < 0.1% |
| -8.953175 | 1 | < 0.1% |
| -8.915424 | 1 | < 0.1% |
| -8.839923 | 1 | < 0.1% |
| -8.688922 | 1 | < 0.1% |
| -8.57567 | 1 | < 0.1% |
| -8.059748 | 1 | < 0.1% |
| -7.103404 | 1 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 83 | 1 | < 0.1% |
| 10.714792 | 1 | < 0.1% |
| 8.902772 | 1 | < 0.1% |
| 7.392756 | 1 | < 0.1% |
| 5.12773 | 1 | < 0.1% |
| 4.523724 | 1 | < 0.1% |
| 3.617714 | 1 | < 0.1% |
| 3.579963 | 1 | < 0.1% |
| 3.466712 | 1 | < 0.1% |
| 3.315711 | 2 | < 0.1% |
weight
Real number (ℝ)
| Distinct | 10114 |
|---|---|
| Distinct (%) | 97.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.01032255321 |
| Minimum | -4.247781 |
|---|---|
| Maximum | 13.173081 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 4684 |
| Negative (%) | 44.9% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -4.247781 |
|---|---|
| 5-th percentile | -1.7781679 |
| Q1 | -0.5419915 |
| median | 0.111803 |
| Q3 | 0.65494425 |
| 95-th percentile | 1.443792 |
| Maximum | 13.173081 |
| Range | 17.420862 |
| Interquartile range (IQR) | 1.19693575 |
Descriptive statistics
| Standard deviation | 1.003507085 |
|---|---|
| Coefficient of variation (CV) | 97.21500723 |
| Kurtosis | 3.844505272 |
| Mean | 0.01032255321 |
| Median Absolute Deviation (MAD) | 0.592267 |
| Skewness | -0.3540198348 |
| Sum | 107.66423 |
| Variance | 1.00702647 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| -0.008699 | 3 | < 0.1% |
| 0.551086 | 3 | < 0.1% |
| 0.489209 | 3 | < 0.1% |
| -0.389455 | 3 | < 0.1% |
| 0.282396 | 3 | < 0.1% |
| 0.805933 | 2 | < 0.1% |
| 0.465251 | 2 | < 0.1% |
| 0.587587 | 2 | < 0.1% |
| -1.048976 | 2 | < 0.1% |
| 0.538214 | 2 | < 0.1% |
| Other values (10104) | 10405 | 99.8% |

| Value | Count | Frequency (%) |
|---|---|---|
| -4.247781 | 1 | < 0.1% |
| -4.164819 | 1 | < 0.1% |
| -4.090691 | 1 | < 0.1% |
| -4.064461 | 1 | < 0.1% |
| -4.011262 | 1 | < 0.1% |
| -3.970141 | 1 | < 0.1% |
| -3.966219 | 1 | < 0.1% |
| -3.931854 | 1 | < 0.1% |
| -3.838349 | 1 | < 0.1% |
| -3.802217 | 1 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 13.173081 | 1 | < 0.1% |
| 4.510897 | 1 | < 0.1% |
| 3.987439 | 1 | < 0.1% |
| 3.817748 | 1 | < 0.1% |
| 3.557736 | 1 | < 0.1% |
| 3.556843 | 1 | < 0.1% |
| 3.545252 | 1 | < 0.1% |
| 3.479784 | 1 | < 0.1% |
| 3.449962 | 1 | < 0.1% |
| 3.42379 | 1 | < 0.1% |
peak_number
Real number (ℝ)
| Distinct | 261 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.01291409904 |
| Minimum | -5.486218 |
|---|---|
| Maximum | 44 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 4593 |
| Negative (%) | 44.0% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -5.486218 |
|---|---|
| 5-th percentile | -1.900349 |
| Q1 | -0.372457 |
| median | 0.064084 |
| Q3 | 0.500624 |
| 95-th percentile | 1.591976 |
| Maximum | 44 |
| Range | 49.486218 |
| Interquartile range (IQR) | 0.873081 |
Descriptive statistics
| Standard deviation | 1.087665481 |
|---|---|
| Coefficient of variation (CV) | 84.22310198 |
| Kurtosis | 257.6364546 |
| Mean | 0.01291409904 |
| Median Absolute Deviation (MAD) | 0.43654 |
| Skewness | 5.669546108 |
| Sum | 134.694053 |
| Variance | 1.183016198 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.001721 | 249 | 2.4% |
| 0.126447 | 241 | 2.3% |
| -0.060642 | 239 | 2.3% |
| 0.095265 | 223 | 2.1% |
| 0.18881 | 221 | 2.1% |
| -0.154186 | 220 | 2.1% |
| 0.064084 | 215 | 2.1% |
| 0.157628 | 212 | 2.0% |
| -0.123005 | 205 | 2.0% |
| -0.029461 | 205 | 2.0% |
| Other values (251) | 8200 | 78.6% |

| Value | Count | Frequency (%) |
|---|---|---|
| -5.486218 | 1 | < 0.1% |
| -5.423855 | 1 | < 0.1% |
| -5.049677 | 1 | < 0.1% |
| -4.488411 | 2 | < 0.1% |
| -4.394866 | 1 | < 0.1% |
| -4.238959 | 1 | < 0.1% |
| -4.176596 | 1 | < 0.1% |
| -4.114233 | 2 | < 0.1% |
| -4.05187 | 2 | < 0.1% |
| -4.020689 | 1 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 44 | 1 | < 0.1% |
| 3.244594 | 1 | < 0.1% |
| 3.182231 | 1 | < 0.1% |
| 2.963961 | 1 | < 0.1% |
| 2.901598 | 2 | < 0.1% |
| 2.870416 | 4 | < 0.1% |
| 2.839235 | 2 | < 0.1% |
| 2.808053 | 1 | < 0.1% |
| 2.776872 | 1 | < 0.1% |
| 2.74569 | 3 | < 0.1% |
ratio
Real number (ℝ)
| Distinct | 9975 |
|---|---|
| Distinct (%) | 95.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.0008177224353 |
| Minimum | -6.719324 |
|---|---|
| Maximum | 4.671232 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 5393 |
| Negative (%) | 51.7% |
| Memory size | 81.6 KiB |
Quantile statistics
| Minimum | -6.719324 |
|---|---|
| 5-th percentile | -1.40464965 |
| Q1 | -0.51609725 |
| median | -0.034513 |
| Q3 | 0.53085475 |
| 95-th percentile | 1.61266485 |
| Maximum | 4.671232 |
| Range | 11.390556 |
| Interquartile range (IQR) | 1.046952 |
Descriptive statistics
| Standard deviation | 1.007093948 |
|---|---|
| Coefficient of variation (CV) | 1231.584099 |
| Kurtosis | 5.8080618 |
| Mean | 0.0008177224353 |
| Median Absolute Deviation (MAD) | 0.518865 |
| Skewness | -0.7257669311 |
| Sum | 8.528845 |
| Variance | 1.014238221 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| -0.691759 | 22 | 0.2% |
| -6.719324 | 19 | 0.2% |
| 0.815132 | 7 | 0.1% |
| -0.103698 | 5 | < 0.1% |
| -0.337194 | 5 | < 0.1% |
| -0.053548 | 5 | < 0.1% |
| -0.089002 | 5 | < 0.1% |
| -0.271228 | 4 | < 0.1% |
| -0.615462 | 4 | < 0.1% |
| -1.19406 | 4 | < 0.1% |
| Other values (9965) | 10350 | 99.2% |

| Value | Count | Frequency (%) |
|---|---|---|
| -6.719324 | 19 | 0.2% |
| -6.092301 | 1 | < 0.1% |
| -5.907917 | 1 | < 0.1% |
| -5.714734 | 1 | < 0.1% |
| -5.543214 | 1 | < 0.1% |
| -5.534124 | 1 | < 0.1% |
| -5.476139 | 1 | < 0.1% |
| -5.460407 | 1 | < 0.1% |
| -5.387654 | 1 | < 0.1% |
| -5.079477 | 1 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 4.671232 | 1 | < 0.1% |
| 4.443329 | 1 | < 0.1% |
| 4.281308 | 1 | < 0.1% |
| 4.048692 | 1 | < 0.1% |
| 4.034684 | 1 | < 0.1% |
| 4.03337 | 1 | < 0.1% |
| 3.979604 | 1 | < 0.1% |
| 3.90426 | 1 | < 0.1% |
| 3.871867 | 1 | < 0.1% |
| 3.752847 | 1 | < 0.1% |
class
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 590.9 KiB |
| Value | Count |
|---|---|
| A | 4286 |
| F | 1961 |
| E | 1095 |
| I | 831 |
| X | 522 |
| Other values (7) | 1735 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 10430 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
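As a rough illustration of that idea, the standard library's `unicodedata` module exposes the general category of each code point (`Lu` is "Letter, uppercase"). The sketch below tallies categories over a list of strings; the helper name `category_counts` and the sample values are illustrative, and script and block lookups are left out because the standard library does not provide them:

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count Unicode general categories over every character in the strings."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

# Single-letter uppercase labels, like the values of this variable.
sample = ["A", "A", "A", "A", "F"]
counts = category_counts(sample)
```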
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | A |
|---|---|
| 2nd row | A |
| 3rd row | A |
| 4th row | A |
| 5th row | F |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| A | 4286 | 41.1% |
| F | 1961 | 18.8% |
| E | 1095 | 10.5% |
| I | 831 | 8.0% |
| X | 522 | 5.0% |
| H | 519 | 5.0% |
| G | 446 | 4.3% |
| D | 352 | 3.4% |
| Y | 266 | 2.6% |
| C | 103 | 1.0% |
| Other values (2) | 49 | 0.5% |
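A table of this shape can be produced by counting values, keeping the most frequent ones, and folding the remainder into a single "Other values" row. A minimal sketch, with the function name `common_values` and the toy labels as assumptions; tie-breaking among equal counts follows first appearance, which may differ from the report's ordering:

```python
from collections import Counter

def common_values(values, top=10):
    """Rows of (value, count, frequency %) for the `top` most frequent values."""
    n = len(values)
    ranked = Counter(values).most_common()
    rows = [(value, count, 100.0 * count / n) for value, count in ranked[:top]]
    rest = ranked[top:]
    if rest:
        # Fold everything beyond the top entries into one summary row.
        rest_count = sum(count for _, count in rest)
        rows.append((f"Other values ({len(rest)})", rest_count, 100.0 * rest_count / n))
    return rows

# Hypothetical toy labels, keeping only the two most frequent.
toy = ["A"] * 5 + ["F"] * 3 + ["E"] + ["I"]
table = common_values(toy, top=2)
```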
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
|---|---|---|
| a | 4286 | 41.1% |
| f | 1961 | 18.8% |
| e | 1095 | 10.5% |
| i | 831 | 8.0% |
| x | 522 | 5.0% |
| h | 519 | 5.0% |
| g | 446 | 4.3% |
| d | 352 | 3.4% |
| y | 266 | 2.6% |
| c | 103 | 1.0% |
| Other values (2) | 49 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| A | 4286 | 41.1% |
| F | 1961 | 18.8% |
| E | 1095 | 10.5% |
| I | 831 | 8.0% |
| X | 522 | 5.0% |
| H | 519 | 5.0% |
| G | 446 | 4.3% |
| D | 352 | 3.4% |
| Y | 266 | 2.6% |
| C | 103 | 1.0% |
| Other values (2) | 49 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Uppercase Letter | 10430 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| A | 4286 | 41.1% |
| F | 1961 | 18.8% |
| E | 1095 | 10.5% |
| I | 831 | 8.0% |
| X | 522 | 5.0% |
| H | 519 | 5.0% |
| G | 446 | 4.3% |
| D | 352 | 3.4% |
| Y | 266 | 2.6% |
| C | 103 | 1.0% |
| Other values (2) | 49 | 0.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 10430 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| A | 4286 | 41.1% |
| F | 1961 | 18.8% |
| E | 1095 | 10.5% |
| I | 831 | 8.0% |
| X | 522 | 5.0% |
| H | 519 | 5.0% |
| G | 446 | 4.3% |
| D | 352 | 3.4% |
| Y | 266 | 2.6% |
| C | 103 | 1.0% |
| Other values (2) | 49 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 10430 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| A | 4286 | 41.1% |
| F | 1961 | 18.8% |
| E | 1095 | 10.5% |
| I | 831 | 8.0% |
| X | 522 | 5.0% |
| H | 519 | 5.0% |
| G | 446 | 4.3% |
| D | 352 | 3.4% |
| Y | 266 | 2.6% |
| C | 103 | 1.0% |
| Other values (2) | 49 | 0.5% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
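The calculation described above can be sketched in a few lines: rank both variables (tied values share the mean of their positions), then take the covariance of the ranks divided by the product of their standard deviations. The function names and toy data below are illustrative assumptions, not the report's implementation:

```python
def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_correlation(rx, ry):
    """Covariance of two rank lists divided by the product of their spreads."""
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    return rank_correlation(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (y = x**3) still gives rho = 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```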
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
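A minimal sketch of that formula, together with a check of the invariance property: shifting and positively rescaling either variable leaves r unchanged. The function name `pearson_r` and the toy data are assumptions for illustration:

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of covariance and deviations cancel, so they are omitted.
    return cov / (sx * sy)

# Hypothetical toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Location and (positive) scale changes applied separately to each variable.
r_shifted = pearson_r([2 * a + 3 for a in x], [0.5 * b - 1 for b in y])
```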
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
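The pair-counting definition above translates directly into code. This sketch follows the formula exactly as stated, so tied pairs count as neither concordant nor discordant and receive no special correction; library implementations often apply a tie-adjusted variant instead. The function name `kendall_tau` and the toy data are assumptions:

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Two concordant pairs and one discordant pair out of three.
tau = kendall_tau([1, 2, 3], [1, 3, 2])
```

The quadratic pair loop is fine for illustration but slow for 10430 observations, where an O(n log n) method would be preferable.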
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. The phik project provides extensive documentation.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
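The data behind such a display is a boolean grid with one entry per cell. A minimal sketch, assuming rows are tuples and `None` marks a missing cell; the function names and toy rows are illustrative. For this dataset every entry would be present, since the report lists 0 missing cells:

```python
def nullity_matrix(rows):
    """True where a cell is present, False where it is missing (None)."""
    return [[cell is not None for cell in row] for row in rows]

def completeness_by_column(rows):
    """Fraction of present cells in each column."""
    matrix = nullity_matrix(rows)
    n = len(matrix)
    return [sum(column) / n for column in zip(*matrix)]

# Hypothetical toy rows with two missing cells.
toy_rows = [(1, None), (2, "A"), (None, "F"), (4, "E")]
matrix = nullity_matrix(toy_rows)
completeness = completeness_by_column(toy_rows)
```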
First rows
| intercolumnar_distance | upper_margin | lower_margin | exploitation | row_number | modular_ratio | interlinear_spacing | weight | peak_number | ratio | class | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.266074 | -0.165620 | 0.320980 | 0.483299 | 0.172340 | 0.273364 | 0.371178 | 0.929823 | 0.251173 | 0.159345 | A |
| 1 | 0.130292 | 0.870736 | -3.210528 | 0.062493 | 0.261718 | 1.436060 | 1.465940 | 0.636203 | 0.282354 | 0.515587 | A |
| 2 | -0.116585 | 0.069915 | 0.068476 | -0.783147 | 0.261718 | 0.439463 | -0.081827 | -0.888236 | -0.123005 | 0.582939 | A |
| 3 | 0.031541 | 0.297600 | -3.210528 | -0.583590 | -0.721442 | -0.307984 | 0.710932 | 1.051693 | 0.594169 | -0.533994 | A |
| 4 | 0.229043 | 0.807926 | -0.052442 | 0.082634 | 0.261718 | 0.148790 | 0.635431 | 0.051062 | 0.032902 | -0.086652 | F |
| 5 | 0.117948 | -0.220579 | -3.210528 | -1.623238 | 0.261718 | -0.349509 | 0.257927 | -0.385979 | -0.247731 | -0.331310 | A |
| 6 | 0.389513 | -0.220579 | -3.210528 | -2.624155 | 0.261718 | -0.764757 | 0.484429 | -0.597510 | -0.372457 | -0.810261 | A |
| 7 | 0.019197 | -0.040001 | 0.288973 | -0.042597 | 0.261718 | -1.013906 | 0.069175 | 0.890701 | 0.095265 | -0.842014 | F |
| 8 | 0.500607 | 0.140576 | 0.388552 | -0.637358 | 0.261718 | -0.681707 | 0.295677 | 0.931046 | 0.500624 | -0.642297 | H |
| 9 | -0.252367 | 0.069915 | 0.246296 | 0.523550 | 0.261718 | -1.221530 | 0.899684 | 1.373076 | 0.625350 | -1.400890 | E |
Last rows
| intercolumnar_distance | upper_margin | lower_margin | exploitation | row_number | modular_ratio | interlinear_spacing | weight | peak_number | ratio | class | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10420 | -0.005490 | 0.478177 | 0.029355 | -0.247644 | 0.172340 | 0.605563 | 0.673182 | -0.951919 | -0.528364 | 0.286973 | A |
| 10421 | 0.241386 | 0.234790 | 0.121822 | 1.037988 | 0.261718 | 0.647088 | 0.182426 | 0.684936 | 0.219991 | 0.628422 | A |
| 10422 | -0.277055 | -0.251983 | -3.203415 | 1.957926 | 0.261718 | 1.892833 | 0.635431 | 1.898205 | 2.184424 | 1.427425 | X |
| 10423 | 4.969080 | -0.385453 | 0.143160 | -2.600732 | 0.976743 | -0.764757 | -0.232828 | -2.348488 | -1.183175 | -0.459372 | I |
| 10424 | 0.216699 | 0.321153 | 0.128935 | 0.491087 | 0.261718 | 0.439463 | 0.069175 | 0.252846 | 0.188810 | 0.482857 | A |
| 10425 | 0.080916 | 0.588093 | 0.015130 | 0.002250 | 0.261718 | -0.557133 | 0.371178 | 0.932346 | 0.282354 | -0.580141 | F |
| 10426 | 0.253730 | -0.338346 | 0.352988 | -1.154243 | 0.172340 | -0.557133 | 0.257927 | 0.348428 | 0.032902 | -0.527134 | F |
| 10427 | 0.229043 | -0.000745 | 0.171611 | -0.002793 | 0.261718 | 0.688613 | 0.295677 | -1.088486 | -0.590727 | 0.580142 | A |
| 10428 | -0.301743 | 0.352558 | 0.288973 | 1.638181 | 0.261718 | 0.688613 | 0.069175 | 0.502761 | 0.625350 | 0.718969 | E |
| 10429 | -0.104241 | -1.037102 | 0.388552 | -1.099311 | 0.172340 | -0.307984 | 0.786433 | -1.337547 | 0.999528 | -0.551063 | X |